Search results for "Symmetric multiprocessor system"
showing 10 items of 11 documents
Online Scheduling of Task Graphs on Hybrid Platforms
2018
Modern computing platforms commonly include accelerators. We target the problem of scheduling applications modeled as task graphs on hybrid platforms made of two types of resources, such as CPUs and GPUs. We consider that task graphs are uncovered dynamically, and that the scheduler has information only on the available tasks, i.e., tasks whose predecessors have all been completed. Each task can be processed by either a CPU or a GPU, and the corresponding processing times are known. Our study extends a previous \(4\sqrt{m/k}\)-competitive online algorithm [2], where m is the number of CPUs and k the number of GPUs (\(m\ge k\)). We prove that no online algorithm can have a competitive ratio …
Parallel Pairwise Epistasis Detection on Heterogeneous Computing Architectures
2016
This is a post-peer-review, pre-copyedit version of an article published in IEEE Transactions on Parallel and Distributed Systems. The final authenticated version is available online at: http://dx.doi.org/10.1109/TPDS.2015.2460247. [Abstract] Development of new methods to detect pairwise epistasis, such as SNP-SNP interactions, in Genome-Wide Association Studies is an important task in bioinformatics as they can help to explain genetic influences on diseases. As these studies are time consuming operations, some tools exploit the characteristics of different hardware accelerators (such as GPUs and Xeon Phi coprocessors) to reduce the runtime. Nevertheless, all these approaches are not able t…
SWhybrid: A Hybrid-Parallel Framework for Large-Scale Protein Sequence Database Search
2017
Computer architectures continue to develop rapidly towards massively parallel and heterogeneous systems. Thus, easily extensible yet highly efficient parallelization approaches for a variety of platforms are urgently needed. In this paper, we present SWhybrid, a hybrid computing framework for large-scale biological sequence database search on heterogeneous computing environments with multi-core or many-core processing units (PUs) based on the Smith- Waterman (SW) algorithm. To incorporate a diverse set of PUs such as combinations of CPUs, GPUs and Xeon Phis, we abstract them as SIMD vector execution units with different number of lanes. We propose a machine model, associated with a unified …
A Fast GPU-Based Motion Estimation Algorithm for H.264/AVC
2012
H.264/AVC is the most recent predictive video compression standard to outperform other existing video coding standards by means of higher computational complexity. In recent years, heterogeneous computing has emerged as a cost-efficient solution for high-performance computing. In the literature, several algorithms have been proposed to accelerate video compression, but so far there have not been many solutions that deal with video codecs using heterogeneous systems. This paper proposes an algorithm to perform H.264/AVC inter prediction. The proposed algorithm performs the motion estimation, both with full-pixel and sub-pixel accuracy, using CUDA to assist the CPU, obtaining remarkable time …
Online Scheduling of Task Graphs on Heterogeneous Platforms
2020
Modern computing platforms commonly include accelerators. We target the problem of scheduling applications modeled as task graphs on hybrid platforms made of two types of resources, such as CPUs and GPUs. We consider that task graphs are uncovered dynamically, and that the scheduler has information only on the available tasks, i.e., tasks whose predecessors have all been completed. Each task can be processed by either a CPU or a GPU, and the corresponding processing times are known. Our study extends a previous $4\sqrt{m/k}$ 4 m / k -competitive online algorithm by Amaris et al. [1] , where $m$ m is the number of CPUs and $k$ k the number of GPUs ( $m\geq k$ m ≥ k ). We prove that no online…
Community-driven computational biology with Debian Linux
2011
Background The Open Source movement and its technologies are popular in the bioinformatics community because they provide freely available tools and resources for research. In order to feed the steady demand for updates on software and associated data, a service infrastructure is required for sharing and providing these tools to heterogeneous computing environments. Results The Debian Med initiative provides ready and coherent software packages for medical informatics and bioinformatics. These packages can be used together in Taverna workflows via the UseCase plugin to manage execution on local or remote machines. If such packages are available in cloud computing environments, the underlyin…
Heterogeneous PBLAS: Optimization of PBLAS for Heterogeneous Computational Clusters
2008
This paper presents a package, called Heterogeneous PBLAS (HeteroPBLAS), which is built on top of PBLAS and provides optimized parallel basic linear algebra subprograms for heterogeneous computational clusters. We present the user interface and the software hierarchy of the first research implementation of HeteroPBLAS. This is the first step towards the development of a parallel linear algebra package for heterogeneous computational clusters. We demonstrate the efficiency of the HeteroPBLAS programs on a homogeneous computing cluster and a heterogeneous computing cluster.
Wireless versus Wired Network-on-Chip to Enable the Multi- Tenant Multi-FPGAs in Cloud
2021
The new era of computing is not CPU-centric but enriched with all the heterogeneous computing resources including the reconfigurable fabric. In multi-FPGA architecture, either deployed within a data center or as a standalone model, inter-FPGA communication is crucial. Network-on-chip exhibits a promising performance for the integration of one FPGA. A sustainable communication architecture requires stable performance as the number of applications or users grows. Wireless network-on-chip has the potential to be that communication architecture, as it boasts the same performance capability as wired solutions in addition to its multicast capacities. We conducted an exploratory study to investiga…
Optimizing H.264/AVC interprediction on a GPU-based framework
2011
H.264/MPEG-4 part 10 is the latest standard for video compression and promises a significant advance in terms of quality and distortion compared with the commercial standards currently most in use such as MPEG-2 or MPEG-4. To achieve this better performance, H.264 adopts a large number of new/improved compression techniques compared with previous standards, albeit at the expense of higher computational complexity. In addition, in recent years new hardware accelerators have emerged, such as graphics processing units (GPUs), which provide a new opportunity to reduce complexity for a large variety of algorithms. However, current GPUs suffer from higher power consumption requirements because of…
Scalable Dense Factorizations for Heterogeneous Computational Clusters
2008
This paper discusses the design and the implementation of the LU factorization routines included in the Heterogeneous ScaLAPACK library, which is built on top of ScaLAPACK. These routines are used in the factorization and solution of a dense system of linear equations. They are implemented using optimized PBLAS, BLACS and BLAS libraries for heterogeneous computational clusters. We present the details of the implementation as well as performance results on a heterogeneous computing cluster.